Abstract. In this paper we present the asymptotic enumeration of RNA structures with pseudoknots. We develop a general framework for the computation of the exponential growth rate and the subexponential factors for $k$-noncrossing RNA structures. Our results are based on the generating function for the number of $k$-noncrossing RNA pseudoknot structures, $S_k(n)$, derived in [17], where $k-1$ denotes the maximal size of sets of mutually intersecting bonds. We prove a functional equation for the generating function $\sum_{n\ge 0} S_k(n) z^n$ and obtain for $k=2$ and $k=3$ the analytic continuation and singular expansions, respectively. It is implicit in our results that for arbitrary $k$ singular expansions exist, and via transfer theorems of analytic combinatorics we obtain asymptotic expressions for the coefficients. We explicitly derive the asymptotic expressions for 2- and 3-noncrossing RNA structures. Our main result is the derivation of the formula

$S_3(n) \sim \frac{10.4724\cdot 4!}{n(n-1)\cdots(n-4)} \left(\frac{5+\sqrt{21}}{2}\right)^n$.

RNA molecules are particularly fascinating since they represent both the genotypic legislative, via their primary sequence, and the phenotypic executive, via their functionality associated with 2D or 3D structures, respectively. Accordingly, it is believed that RNA may have been instrumental for early evolution, before proteins emerged. The primary sequence of an RNA molecule is the sequence of nucleotides A, G, U and C together with the Watson-Crick (A-U, G-C) and (U-G) base pairing rules specifying which pairs of nucleotides can potentially form bonds. Single stranded RNA molecules form helical structures whose bonds satisfy the above base pairing rules and which, in many cases, determine their function. For instance, RNA ribozymes are capable of catalytic activity, cleaving other RNA molecules. RNA secondary structure prediction is of polynomial complexity [5T], which results from the fact that in secondary structures no two bonds can cross. Leaving the paradigm of RNA secondary structures, i.e. studying RNA structures with crossing bonds, the RNA pseudoknot structures, poses challenging problems for computational biology. Prediction algorithms for RNA pseudoknot structures are much harder to derive since there exists no a priori tree-structure and the subadditivity of local solutions is not guaranteed. RNA pseudoknot structures can be categorized in terms of the maximal size of sets of mutually crossing bonds [17]. To be precise, a $k$-noncrossing RNA structure has at most $k-1$ mutually crossing bonds and a minimum bond-length of 2, i.e. for any $i$, the nucleotides $i$ and $i+1$ cannot form a bond. The asymptotics of $k$-noncrossing RNA structures is of central importance in this context. The key question is how to decompose a $k$-noncrossing RNA structure into a collection of sub-structures (which can easily be computed), and what the properties of this decomposition are. Given such a decomposition we can predict the factors and reassemble the corresponding pseudoknot structure. A first step towards finding such decompositions is to have information about the cardinalities of the respective sets of structures involved. The asymptotic analysis of $k$-noncrossing RNA structures is based on their generating function, obtained in [17]. The particular formulas are however alternating sums, which make even the computation of the exponential growth rate a nontrivial task. In this paper we develop a framework for the asymptotic enumeration of $k$-noncrossing RNA structures. Before we go into this in more detail, let us first provide some background on coarse grained RNA structures and put our results into context.

FIGURE 1. RNA secondary structures. Diagram representation (top): the primary sequence, GAGAGCCUUUGGACCUCA, is drawn horizontally and its backbone bonds are ignored. All bonds are drawn in the upper halfplane and secondary structures have the property that no two arcs intersect and all arcs have minimum length 2. Outer planar graph representation (bottom).



1.1. RNA secondary structures or the universality of the square root. About three decades ago Waterman et al. pioneered the concept of RNA secondary structures [26, 22, 21, 33, 16]. The key property of secondary structures is best understood by considering a structure as a diagram, which is obtained as follows: one draws the primary sequence of nucleotides horizontally and ignores all chemical bonds of its backbone. Then one draws all bonds, i.e. nucleotide interactions satisfying the Watson-Crick base pairing rules (and G-U pairs), as arcs in the upper halfplane, effectively identifying the structure with the set of all arcs. In this representation, RNA secondary structures have the following property: there exist no two arcs $(i_1, j_1)$, $(i_2, j_2)$, where $i_1 < j_1$ and $i_2 < j_2$, with the property $i_1 < i_2 < j_1 < j_2$, and all arcs have at least length 2. Equivalently, there exist no two arcs that cross in the diagram representation of the structure, see Figure 1. Basically, all combinatorial properties of secondary structures are derived from Waterman's basic recursion

(1.1)   $S_2(n+1) = S_2(n) + \sum_{s=0}^{n-2} S_2(s)\, S_2(n-1-s)$,

where $S_2(n)$ denotes the number of RNA secondary structures and $S_2(0) = S_2(1) = 1$. Eq. (1.1) is an immediate consequence of considering secondary structures as Motzkin paths, i.e. peak-free paths with up, down and horizontal steps that stay in the upper halfplane, start at the origin and end on the x-axis. The recursion is in particular the key for all asymptotic results since it allows one to obtain an implicit functional equation for the generating function, to which Darboux-type theorems [14, 28] can subsequently be applied. If specific conditions are imposed, for instance minimum loop-size or stack-length, it is straightforward to translate these constraints into restricted Motzkin paths, all of which satisfy some variant of eq. (1.1). As a result, all asymptotic formulae are of the same type: a square root, that is, the asymptotic behavior is determined by an algebraic branch singularity with the subexponential factor $n^{-3/2}$. For instance, the number of RNA secondary structures having a minimum hairpin-loop length of 3 and minimum stack-length 2 is asymptotically given by $S_2(n) \sim 1.4848\, n^{-3/2}\, 1.8488^n$ [14].

FIGURE 2. Universality of the square root. We display the branch-point singularity (here at $\rho_2 = \frac{3-\sqrt{5}}{2}$), i.e. the critical singularity for the asymptotics of RNA secondary structures. All singularities arising from the enumeration of certain classes of secondary structures are of this type, whence the subexponential factor $n^{-3/2}$.

The number of RNA secondary structures having exactly $\ell$ isolated vertices, $S_2(n,\ell)$, satisfies the two term recursion $(n-\ell)(n-\ell+2)\, S_2(n,\ell) - (n+\ell)(n+\ell-2)\, S_2(n-2,\ell) = 0$ [17], and Waterman proved in [33] the following beautiful formula

(1.2)   $S_2(n,\ell) = \frac{2}{n-\ell} \binom{\frac{n+\ell}{2}}{\frac{n-\ell}{2}+1} \binom{\frac{n+\ell}{2}-1}{\frac{n-\ell}{2}-1}$,

resulting from a bijection between secondary structures and linear trees. In [23] it is shown that the prediction of secondary structures can be obtained in polynomial time and yet again eq. (1.1)
is central for all folding algorithms [2Ql Q31 [23l [31j [H 03] . 
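
To make the recursion concrete, the following minimal sketch evaluates it numerically. It is not part of the original derivation: it assumes Waterman's recursion in the form $S_2(n+1) = S_2(n) + \sum_{s=0}^{n-2} S_2(s)\,S_2(n-1-s)$ with $S_2(0) = S_2(1) = 1$, and the function name `waterman_counts` is our own choice.

```python
def waterman_counts(n_max):
    """Return [S_2(0), ..., S_2(n_max)] using Waterman's recursion.

    Assumed form: S_2(n+1) = S_2(n) + sum_{s=0}^{n-2} S_2(s) * S_2(n-1-s),
    with S_2(0) = S_2(1) = 1 (minimum arc length 2, no crossing arcs).
    """
    s = [1, 1]
    for n in range(1, n_max):
        # Either vertex n+1 is isolated, or it is paired with vertex s+1,
        # leaving s vertices to the left and n-1-s >= 1 vertices under the arc.
        s.append(s[n] + sum(s[j] * s[n - 1 - j] for j in range(n - 1)))
    return s[: n_max + 1]


if __name__ == "__main__":
    counts = waterman_counts(40)
    print(counts[:10])
    # Successive ratios approach the growth rate (3 + sqrt(5)) / 2 only slowly,
    # because of the subexponential factor n^(-3/2).
    print(counts[40] / counts[39])
```

The first values are 1, 1, 1, 2, 4, 8, 17, 37, 82, 185, which can be confirmed by hand for small $n$.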

1.2. Beyond secondary structures. While the concept of secondary structure is of fundamental importance, it is well-known that there exist additional types of nucleotide interactions [1]. These bonds are called pseudoknots [3] and occur in functional RNA (RNAseP [5]), ribosomal RNA [8] and are conserved in the catalytic core of group I introns. In plant viral RNAs pseudoknots mimic tRNA structure, and in vitro RNA evolution experiments [6] have produced families of RNA structures with pseudoknot motifs when binding HIV-1 reverse transcriptase. Important mechanisms like ribosomal frame shifting [7] also involve pseudoknot interactions. There exist several prediction algorithms for pseudoknot RNA structures [29, 32, 30, 17], all of which can identify particular pseudoknot motifs. Stadler et al. [13] suggested a classification of RNA pseudoknot-types based on a notion of inconsistency graphs and computed the upper bound of 4.7613 for the exponential growth factor of bi-secondary structures. Bi-secondary structures are "superpositions" of two secondary structures, i.e. they can be drawn as a set of non-intersecting arcs in the upper and lower half plane, respectively. Figure 3 shows how bi-secondary structures naturally arise when passing from outer-planar to planar diagram representations. The concept of $k$-noncrossing RNA structures generalizes both secondary and bi-secondary structures. While RNA secondary structures are precisely 2-noncrossing RNA structures, bi-secondary structures correspond to planar 3-noncrossing RNA structures. The key advantage of $k$-noncrossing RNA structures is that their defining property is intrinsically local. It can be expected that this facilitates fast folding algorithms. In Figure 4 we contrast all three structural concepts: secondary, bi-secondary and $k$-noncrossing RNA structures.

FIGURE 3. Beyond secondary structures: an RNA bi-secondary structure as the generalization from outer-planar to planar diagrams. We display a secondary RNA structure (top) and a bi-secondary structure (bottom). Reflecting the arcs (3, 8) and (9, 12) w.r.t. the x-axis yields two secondary structures.



1.3. Organization and main results. In Section 2 we provide the necessary background on $k$-noncrossing RNA structures and the generating function $\sum_{n\ge 0} S_k(n) z^n$. In Section 3 we derive the exponential factor for $k$-noncrossing RNA structures, i.e. we compute the base at which $k$-noncrossing RNA structures asymptotically grow. The exponential factor is the key result for all complexity considerations arising in the context of prediction algorithms for RNA pseudoknot structures. To make it easily accessible to a broad readership we give an elementary proof based on real analysis and transformations of the generating function. Central to our proof is a functional identity (Lemma 1) whose true power is revealed only later in Section 4, where it is put in the context of analytic functions. Remarkably, Stadler's upper bound for bi-secondary structures coincides with the exact exponential factor obtained via Theorem 2 for 3-noncrossing RNA structures up to $O(10^{-2})$. In Section 4 we compute the asymptotics for 2-noncrossing and 3-noncrossing RNA structures, respectively. Since the method via implicit functions used for secondary structures [14] does not work for $k > 2$, we develop a new approach which is based on concepts developed by Flajolet et al. using singular expansions and transfer theorems [Tj] [551 HH 121 H]. The basic strategy is as follows: we first obtain an analytic continuation $f(z)$, generalizing the functional equation of Lemma 1 to a complex indeterminate $z$. For $k = 3$ we obtain an expression involving the Legendre polynomial $P^{-1}_{3/2}(z)$, indicating that the type of singularity is fundamentally different from the branch-point singularity of the square root. In Figure 5 we display the analytic continuation of $\sum_{n\ge 0} S_3(n) z^n$ at the dominant singularity $\rho_3 = \frac{5-\sqrt{21}}{2}$ and its singular expansion. We proceed by proving that the dominant singularity of $f(z)$ is indeed unique. The next step is to establish that there exists a singular expansion for $f(z)$, i.e. there exists a function $h$ such that $f(z) = O(h(z))$ at the dominant singularity (see Section 4). Intuitively, this singular expansion approximates $f(z)$ well enough to retrieve precise asymptotic expansions of the coefficients via transfer theorems [TTJ 023 El]. The existence of the singular expansion can be deduced from the particular form of the generating function for $k$-noncrossing RNA structures. Due to Lemma 2 it suffices to analyze the coefficients $f_3(2n, 0)$, which are known via the determinant formula of Bessel functions in eq. (2.3). We then proceed using this particular form of $f_3(2n, 0)$ to explicitly compute the singular expansion and show in the process how the logarithmic term arises naturally from elementary calculations. It should be remarked that we use the transfer theorems since our generating function is the composition of two analytic functions, $f(\vartheta(z))$. We then show that the type of the singularity of $f(\vartheta(z))$ coincides with the type of singularity of the function $f(z)$. The phenomenon of the persistence of the singularity of the "outer" function $f(z)$ is known as the supercritical case [11]. This will allow us to obtain the asymptotics of the coefficients of the function $f(\vartheta(z))$. One main result of the paper is the formula

(1.3)   $S_3(n) \sim \frac{10.4724\cdot 4!}{n(n-1)\cdots(n-4)} \left(\frac{5+\sqrt{21}}{2}\right)^n$.

FIGURE 4. $k$-noncrossing RNA structures: (a) secondary structure (with isolated labels 3, 7, 8, 10), (b) bi-secondary structure, 2, 9 being isolated, (c) 3-noncrossing structure which is not a bi-secondary structure. In fact, this is the smallest 3-noncrossing RNA structure which is not a bi-secondary structure.

In order to assess the quality of our formula, we list the subexponential factors for $k = 2$ and $k = 3$, obtained from Theorem 4 and Theorem 5. In the table below we compare, for $k = 2, 3$, the quantities $S_2(n) / \left(\frac{3+\sqrt{5}}{2}\right)^n$ and $S_3(n) / \left(\frac{5+\sqrt{21}}{2}\right)^n$ with the subexponential factors $s_k(n)$, respectively. $S_2(n)$ and $S_3(n)$ are given by the generating function of $k$-noncrossing RNA structures.

2. $k$-NONCROSSING RNA STRUCTURES

Suppose we are given the primary RNA sequence

AACCAUGUGGUACUUGAUGGCGAC.

We then identify an RNA structure with the set of all bonds different from the backbone-bonds of its primary sequence, i.e. the arcs $(i, i+1)$ for $1 \le i \le n-1$. Accordingly an RNA structure is a combinatorial graph over the labels of the nucleotides of the primary sequence. These graphs can be represented in several ways. In Figure 6 we represent a particular structure with loop-loop interactions in two ways. In the following we will consider structures as diagram representations of digraphs. A digraph $D_n$ is a pair of sets $(V_{D_n}, E_{D_n})$, where $V_{D_n} = \{1, \dots, n\}$ and $E_{D_n} \subset \{(i,j) \mid 1 \le i < j \le n\}$. $V_{D_n}$ and $E_{D_n}$ are called vertex and arc set, respectively. A $k$-noncrossing digraph is a digraph in which all vertices have degree $\le 1$ and which does not contain a $k$-set of arcs that are mutually intersecting, or more formally

(2.1)   $\nexists\; (i_{r_1}, j_{r_1}), (i_{r_2}, j_{r_2}), \dots, (i_{r_k}, j_{r_k}); \quad i_{r_1} < i_{r_2} < \dots < i_{r_k} < j_{r_1} < j_{r_2} < \dots < j_{r_k}$.

We will represent digraphs as diagrams (Figure 6) by representing the vertices as integers on a line and connecting any two adjacent vertices by an arc in the upper half-plane. The direction of the arcs is implicit in the linear ordering of the vertices and accordingly omitted.

FIGURE 6. A 3-noncrossing RNA structure, as a planar graph (top) and as a diagram (bottom).

Definition 1. A $k$-noncrossing RNA structure is a digraph in which all vertices have degree $\le 1$, that does not contain a $k$-set of mutually intersecting arcs and 1-arcs, i.e. arcs of the form $(i, i+1)$, respectively. We denote the number of RNA structures by $S_k(n)$ and the number of RNA structures with exactly $\ell$ isolated vertices and with exactly $h$ arcs by $S_k(n,\ell)$ and $S'_k(n,h)$, respectively. Note that $S'_k(n,h) = S_k(n, n-2h)$. We call an RNA structure restricted if and only if it does not contain any 2-arcs, i.e. arcs of the form $(i, i+2)$.
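
Definition 1 can be checked by exhaustive enumeration for small $n$. The following is a minimal brute-force sketch, not the method of the paper; the helper names `partial_matchings`, `has_k_crossing` and `count_structures` are hypothetical. It lists all arc sets with vertex degree at most 1 and no 1-arcs and discards those that contain $k$ mutually intersecting arcs in the sense of eq. (2.1).

```python
from itertools import combinations


def partial_matchings(points):
    """Yield every arc set on `points` with vertex degree <= 1 and no 1-arcs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    # Case 1: the smallest remaining vertex stays isolated.
    yield from partial_matchings(rest)
    # Case 2: it is paired with a later vertex at distance >= 2.
    for idx, partner in enumerate(rest):
        if partner - first >= 2:
            remaining = rest[:idx] + rest[idx + 1:]
            for arcs in partial_matchings(remaining):
                yield [(first, partner)] + arcs


def has_k_crossing(arcs, k):
    """True if some k arcs satisfy i_1 < ... < i_k < j_1 < ... < j_k."""
    for subset in combinations(sorted(arcs), k):
        # Sorting orders the left endpoints; check the remaining inequalities.
        lefts_before_rights = subset[-1][0] < subset[0][1]
        rights_increasing = all(a[1] < b[1] for a, b in zip(subset, subset[1:]))
        if lefts_before_rights and rights_increasing:
            return True
    return False


def count_structures(k, n):
    """Number of k-noncrossing RNA structures over n vertices (brute force)."""
    vertices = list(range(1, n + 1))
    return sum(1 for arcs in partial_matchings(vertices)
               if not has_k_crossing(arcs, k))


if __name__ == "__main__":
    print([count_structures(2, n) for n in range(8)])
    print([count_structures(3, n) for n in range(7)])
```

The enumeration grows super-exponentially and is only meant as a sanity check for small $n$; for $n = 6$ the only structure excluded for $k = 3$ is the one with arcs (1,4), (2,5), (3,6).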








EMMA Y. JIN AND CHRISTIAN M. REIDYS * 



Let $f_k(n,\ell)$ denote the number of $k$-noncrossing digraphs with $\ell$ isolated points. We have shown in [17] that

(2.2)   $f_k(n,\ell) = \binom{n}{\ell}\, f_k(n-\ell, 0)$

(2.3)   $\det[I_{i-j}(2x) - I_{i+j}(2x)]_{i,j=1}^{k-1} = \sum_{n\ge 0} f_k(n,0)\, \frac{x^n}{n!}$

(2.4)   $e^x \det[I_{i-j}(2x) - I_{i+j}(2x)]_{i,j=1}^{k-1} = \Big(\sum_{\ell\ge 0} \frac{x^\ell}{\ell!}\Big)\Big(\sum_{n\ge 0} f_k(n,0)\, \frac{x^n}{n!}\Big) = \sum_{n\ge 0} \Big\{\sum_{\ell=0}^{n} f_k(n,\ell)\Big\} \frac{x^n}{n!}$,

where $I_m$ denotes the hyperbolic Bessel function of the first kind of order $m$. In particular we obtain for $k = 2$ and $k = 3$

(2.5)   $f_2(n,\ell) = \binom{n}{\ell}\, C_{(n-\ell)/2}$  and  $f_3(n,\ell) = \binom{n}{\ell} \left[ C_{\frac{n-\ell}{2}+2}\, C_{\frac{n-\ell}{2}} - C_{\frac{n-\ell}{2}+1}^2 \right]$,

where $C_m = \frac{1}{m+1}\binom{2m}{m}$ is the $m$th Catalan number. The derivation of the generating function of $k$-noncrossing RNA structures, given in Theorem 1 below, uses advanced methods and novel constructions of enumerative combinatorics due to Chen et al. [34, 15] and Stanley's mapping between matchings and oscillating tableaux, i.e. families of Young diagrams in which any two consecutive shapes differ by exactly one square. The enumeration is obtained using the reflection principle due to Gessel and Zeilberger [15] and Lindström [2] combined with an inclusion-exclusion argument in order to eliminate the arcs of length 1. In [17] generalizations to restricted (i.e. where arcs of the form $(i, i+2)$ are excluded) and circular RNA structures are given.

Theorem 1. [17] Let $k \in \mathbb{N}$, $k \ge 2$, then the number of RNA structures with $\ell$ isolated vertices, $S_k(n,\ell)$, is given by

(2.6)   $S_k(n,\ell) = \sum_{b=0}^{(n-\ell)/2} (-1)^b \binom{n-b}{b}\, f_k(n-2b, \ell)$,

where $f_k(n-2b, \ell)$ is given by the generating function in eq. (2.3). Furthermore the number of $k$-noncrossing RNA structures, $S_k(n)$, is

(2.7)   $S_k(n) = \sum_{b=0}^{\lfloor n/2 \rfloor} (-1)^b \binom{n-b}{b} \Big\{ \sum_{\ell=0}^{n-2b} f_k(n-2b, \ell) \Big\}$,

where $\{\sum_{\ell=0}^{n-2b} f_k(n-2b, \ell)\}$ is given by the generating function in eq. (2.4).
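
The alternating sum of Theorem 1 is straightforward to evaluate for $k = 2$ and $k = 3$. The sketch below is an illustration under stated assumptions rather than code from the paper: it assumes $f_2(2m,0) = C_m$, $f_3(2m,0) = C_{m+2} C_m - C_{m+1}^2$ and $f_k(n,\ell) = \binom{n}{\ell} f_k(n-\ell,0)$, and all function names are our own.

```python
from math import comb


def catalan(m):
    """m-th Catalan number C_m = binom(2m, m) / (m + 1)."""
    return comb(2 * m, m) // (m + 1)


def f_matchings(k, m):
    """f_k(2m, 0): k-noncrossing perfect matchings on 2m vertices, k in {2, 3}."""
    if k == 2:
        return catalan(m)
    if k == 3:
        return catalan(m + 2) * catalan(m) - catalan(m + 1) ** 2
    raise ValueError("closed forms are only assumed for k = 2 and k = 3")


def partial_matching_total(k, n):
    """sum_l f_k(n, l) = sum_m binom(n, 2m) * f_k(2m, 0)."""
    return sum(comb(n, 2 * m) * f_matchings(k, m) for m in range(n // 2 + 1))


def rna_structures(k, n):
    """S_k(n) via the inclusion-exclusion sum over b that removes 1-arcs."""
    return sum((-1) ** b * comb(n - b, b) * partial_matching_total(k, n - 2 * b)
               for b in range(n // 2 + 1))


if __name__ == "__main__":
    print([rna_structures(2, n) for n in range(10)])
    print([rna_structures(3, n) for n in range(10)])
```

All arithmetic is exact integer arithmetic, so the values can be compared directly with a brute-force enumeration for small $n$.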
3. The exponential factor

In this section we obtain the exponential growth factor of the coefficients $S_k(n)$. Let us begin by considering the generating function $\sum_{n\ge 0} S_k(n) z^n$ as a power series over $\mathbb{R}$. Since $\sum_{n\ge 0} S_k(n) z^n$ has monotonically increasing coefficients, $\lim_{n\to\infty} S_k(n)^{\frac{1}{n}}$ exists and determines via Hadamard's formula its radius of convergence. As we already mentioned, due to the inclusion-exclusion form of the terms $S_k(n)$, it is not obvious, however, how to compute this radius of convergence. Our strategy consists in first showing that $S_k(n)$ is closely related to $f_k(2n, 0)$ via a functional relation of generating functions.

Lemma 1. Let $z$ be an indeterminate over $\mathbb{R}$ and $w \in \mathbb{R}$ a parameter. Let furthermore $\rho_k(w)$ denote the radius of convergence of the power series $\sum_{n\ge 0} \big[\sum_{h\le n/2} S'_k(n,h)\, w^{2h}\big] z^n$. Then for $|z| < \rho_k(w)$ holds

(3.1)   $\sum_{n\ge 0} \sum_{h\le n/2} S'_k(n,h)\, w^{2h} z^n = \frac{1}{w^2 z^2 - z + 1} \sum_{n\ge 0} f_k(2n,0) \left( \frac{wz}{w^2 z^2 - z + 1} \right)^{2n}$.

In particular we have for $w = 1$,

(3.2)   $\sum_{n\ge 0} S_k(n) z^n = \frac{1}{z^2 - z + 1} \sum_{n\ge 0} f_k(2n,0) \left( \frac{z}{z^2 - z + 1} \right)^{2n}$.

The proof of Lemma 1 is a bit technical and consists of a series of changes of orders of summation and Laplace transforms. We give the proof in Section 5. In Section 4 we will employ basic complex analysis and extend eq. (3.1) to complex $z$. Lemma 1 is the key to prove Theorem 2 below, where we obtain the exponential factor for any $k > 1$. In its proof we recruit the Theorem of Pringsheim [10], which asserts that a power series $\sum_{n\ge 0} a_n z^n$ with $a_n \ge 0$ has its radius of convergence as dominant (but not necessarily unique) singularity.
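
The functional identity can be checked numerically for $k = 2$, where $\sum_{n\ge 0} f_2(2n,0)\, y^n$ is the Catalan generating function $\frac{1-\sqrt{1-4y}}{2y}$. The following is a minimal sketch under these assumptions (the function names are ours, and $S_2(n)$ is computed from Waterman's recursion); it compares a truncated power series with the closed form at a real point inside the disc of convergence.

```python
from math import sqrt


def secondary_counts(n_max):
    """[S_2(0), ..., S_2(n_max)] from Waterman's recursion, S_2(0) = S_2(1) = 1."""
    s = [1, 1]
    for n in range(1, n_max):
        s.append(s[n] + sum(s[j] * s[n - 1 - j] for j in range(n - 1)))
    return s[: n_max + 1]


def series_side(z, terms=200):
    """Truncated left-hand side: sum_{n <= terms} S_2(n) z^n."""
    return sum(c * z ** n for n, c in enumerate(secondary_counts(terms)))


def closed_form_side(z):
    """Right-hand side for k = 2 and w = 1; requires z != 0."""
    denom = z * z - z + 1
    y = (z / denom) ** 2
    catalan_gf = (1 - sqrt(1 - 4 * y)) / (2 * y)  # sum_n C_n y^n for 0 < y <= 1/4
    return catalan_gf / denom


if __name__ == "__main__":
    for z in (0.1, 0.2, 0.3):
        print(z, series_side(z), closed_form_side(z))
```

Both sides agree to floating-point accuracy for $0 < z < \frac{3-\sqrt{5}}{2}$; closer to the singularity more terms of the series are needed.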

Theorem 2. Let $k$ be a positive integer, $k > 1$, and let $r_k$ be the radius of convergence of the power series $\sum_{n\ge 0} f_k(2n, 0) z^{2n}$. Then the power series $\sum_{n\ge 0} S_k(n) z^n$ has the real valued, dominant singularity at

$\rho_k = \frac{\frac{1}{r_k}+1}{2} - \sqrt{\left(\frac{\frac{1}{r_k}+1}{2}\right)^2 - 1}$

and for the number of $k$-noncrossing RNA structures holds

(3.3)   $S_k(n) \sim \left(\frac{1}{\rho_k}\right)^n$.

We will prove later in Theorem 4 and Theorem 5 that for $k = 2$ and $k = 3$ the dominant singularities $\rho_2$ and $\rho_3$ are unique, respectively.

Proof. Suppose we are given $r_k$, then $r_k \le \frac{1}{2}$ (this follows immediately from $C_n \sim 2^{2n}$ via Stirling's formula) and obviously, $(z - \frac{1}{2})^2 + \frac{3}{4}$ has no roots over $\mathbb{R}$. The functional identity of Lemma 1 allows us to derive the radius of convergence of $\sum_{n\ge 0} S_k(n) z^n$. Setting $w = 1$, Lemma 1 yields

(3.4)   $\sum_{n\ge 0} S_k(n) z^n = \frac{1}{z^2 - z + 1} \sum_{n\ge 0} f_k(2n,0) \left( \frac{z}{z^2 - z + 1} \right)^{2n}$.

$f_k(2n, 0)$ is monotone, whence the limit $\lim_{n\to\infty} f_k(2n, 0)^{\frac{1}{2n}}$ exists, and applying Hadamard's formula we obtain $\lim_{n\to\infty} f_k(2n, 0)^{\frac{1}{2n}} = \frac{1}{r_k}$. For $z \in \mathbb{R}$, we proceed by computing the roots of

$\frac{z}{z^2 - z + 1} = r_k$,

which for $r_k \le \frac{1}{2}$ has the minimal root $\rho_k = \frac{\frac{1}{r_k}+1}{2} - \sqrt{\left(\frac{\frac{1}{r_k}+1}{2}\right)^2 - 1}$. We next show that $\rho_k$ is indeed the radius of convergence of $\sum_{n\ge 0} S_k(n) z^n$. For this purpose we observe that the map

(3.5)   $\vartheta\colon [0, \tfrac{3-\sqrt{5}}{2}] \longrightarrow [0, \tfrac{1}{2}], \quad z \mapsto \frac{z}{(z - \frac{1}{2})^2 + \frac{3}{4}}$, where $\vartheta(\rho_k) = r_k$,

is bijective, continuous and strictly increasing. Continuity and strict monotonicity of $\vartheta$ guarantee in view of eq. (3.4) that $\rho_k$ is indeed the radius of convergence of the power series $\sum_{n\ge 0} S_k(n) z^n$. In order to show that $\rho_k$ is a dominant singularity we consider $\sum_{n\ge 0} S_k(n) z^n$ as a power series over $\mathbb{C}$. Since $S_k(n) \ge 0$, the theorem of Pringsheim [10] guarantees that $\rho_k$ itself is a singularity. By construction $\rho_k$ has minimal absolute value and is accordingly dominant. Since $S_k(n)$ is monotone, $\lim_{n\to\infty} S_k(n)^{\frac{1}{n}}$ exists and we obtain using Hadamard's formula

(3.6)   $\lim_{n\to\infty} S_k(n)^{\frac{1}{n}} = \frac{1}{\rho_k}$, or equivalently $S_k(n) \sim \left(\frac{1}{\rho_k}\right)^n$,

from which eq. (3.3) follows and the proof of the theorem is complete. □
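
As a small worked illustration of Theorem 2, the sketch below computes $\rho_k$ as the minimal root of $\frac{z}{z^2-z+1} = r_k$. It assumes the values $r_2 = \frac{1}{2}$ and $r_3 = \frac{1}{4}$, which follow from $f_2(2n,0) = C_n \sim 4^n$ and $f_3(2n,0) \sim 16^n$ up to subexponential factors; the function names are our own.

```python
from math import sqrt


def dominant_singularity(r):
    """Minimal real root of z / (z^2 - z + 1) = r for 0 < r <= 1/2."""
    a = (1.0 / r + 1.0) / 2.0
    return a - sqrt(a * a - 1.0)


def theta(z):
    """The map z -> z / ((z - 1/2)^2 + 3/4) used in the proof."""
    return z / ((z - 0.5) ** 2 + 0.75)


if __name__ == "__main__":
    for k, r in ((2, 0.5), (3, 0.25)):
        rho = dominant_singularity(r)
        # 1 / rho is the exponential growth factor of S_k(n).
        print(k, rho, 1.0 / rho, theta(rho))
```

For $r_2 = \frac{1}{2}$ this returns $\frac{3-\sqrt{5}}{2}$ and for $r_3 = \frac{1}{4}$ it returns $\frac{5-\sqrt{21}}{2}$, i.e. growth factors $\frac{3+\sqrt{5}}{2}$ and $\frac{5+\sqrt{21}}{2}$.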



4. ASYMPTOTICS OF 3-NONCROSSING RNA STRUCTURES 



In this section we provide the asymptotics for RNA secondary and 3-noncrossing RNA structures. For $k = 2$ and $k = 3$, i.e. for RNA secondary and 3-noncrossing RNA structures, respectively, we will explicitly obtain analytic continuations of the power series $\sum_{n\ge 0} S_2(n) z^n$ and $\sum_{n\ge 0} S_3(n) z^n$, respectively. As a result the dominant singularity relevant for the asymptotics is known and Theorem 2 becomes obsolete. However, it is not entirely trivial to derive the analytic continuations for arbitrary crossing numbers $k$. In the context of complexity of prediction algorithms for RNA pseudoknot structures it suffices to obtain the exponential factor, which is given via Theorem 2.

We begin by revealing the "true" power of Lemma 1 in the context of analytic functions.

Lemma 2. Let $k > 1$ be an integer, then we have for arbitrary $z \in \mathbb{C}$ with the property $|z| < \rho_k$ the equality

(4.1)   $\sum_{n\ge 0} S_k(n) z^n = \frac{1}{z^2 - z + 1} \sum_{n\ge 0} f_k(2n,0) \left( \frac{z}{z^2 - z + 1} \right)^{2n}$.

Proof. The power series $\sum_{n\ge 0} S_k(n) z^n$ and $\frac{1}{z^2-z+1}\sum_{n\ge 0} f_k(2n, 0) \left(\frac{z}{z^2-z+1}\right)^{2n}$ are analytic in a disc of radius $0 < \epsilon < \rho_k$ and according to Lemma 1 coincide on the interval $]-\epsilon, \epsilon[$. Therefore both functions are equal on the sequence $(\frac{1}{n})_{n\in\mathbb{N}}$, which converges to 0, and standard results of complex analysis (zeros of nontrivial analytic functions are isolated) imply that eq. (4.1) holds for any $z \in \mathbb{C}$ with $|z| < \rho_k$, whence the lemma. □

The derivation of the subexponential factors is based on singular expansions in combination with a transfer theorem, which recruits Hankel contours, see Figure 7. Let us begin by specifying a suitable domain for our Hankel contours tailored for Theorem 3.

Definition 2. Given two numbers $\phi, R$, where $R > 1$ and $0 < \phi < \frac{\pi}{2}$, and $\rho \in \mathbb{R}$, the open domain $\Delta_\rho(\phi, R)$ is defined as

(4.2)   $\Delta_\rho(\phi, R) = \{ z \mid |z| < R,\ z \ne \rho,\ |\mathrm{Arg}(z - \rho)| > \phi \}$.

A domain is a $\Delta_\rho$-domain if it is of the form $\Delta_\rho(\phi, R)$ for some $R$ and $\phi$. A function is $\Delta_\rho$-analytic if it is analytic in some $\Delta_\rho$-domain.

Since the Taylor coefficients have the property

(4.3)   $\forall\, \gamma \in \mathbb{C} \setminus 0; \quad [z^n] f(z) = \gamma^n\, [z^n] f\big(\tfrac{z}{\gamma}\big)$,

we can, w.l.o.g., reduce our analysis to the case where 1 is the dominant singularity. We use the notation

(4.4)   $\big( f(z) = O(g(z))$ as $z \to \rho \big) \iff \big( f(z)/g(z)$ is bounded as $z \to \rho \big)$,

and if we write $f(z) = O(g(z))$ it is implicitly assumed that $z$ tends to a (unique) singularity. $[z^n] f(z)$ denotes the coefficient of $z^n$ in the power series expansion of $f(z)$ around 0.

FIGURE 7. $\Delta_1$-domain enclosing a Hankel contour. We assume $z = 1$ to be the unique dominant singularity. The coefficients are obtained via Cauchy's integral formula and the integral path is decomposed into 4 segments. Segment 1 becomes asymptotically irrelevant since by construction the function involved is bounded on this segment. Relevant are the rectilinear segments 2 and 4 and the inner circle 3. The only contributions to the contour integral are made here, which shows why the singular expansion allows one to approximate the coefficients so well.

Theorem 3. Let $\alpha$ be an arbitrary complex number in $\mathbb{C} \setminus \mathbb{Z}_{\le 0}$ and suppose $f(z) = O((1-z)^{-\alpha})$, then

$[z^n] f(z) \sim K\, \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left[ 1 + \frac{\alpha(\alpha-1)}{2n} + \frac{\alpha(\alpha-1)(\alpha-2)(3\alpha-1)}{24 n^2} + \frac{\alpha^2(\alpha-1)^2(\alpha-2)(\alpha-3)}{48 n^3} + O\Big(\frac{1}{n^4}\Big) \right]$

for some $K > 0$. Suppose $r \in \mathbb{Z}_{\ge 0}$ and $f(z) = O\big((1-z)^r \ln(\frac{1}{1-z})\big)$, then we have

(4.5)   $[z^n] f(z) \sim K\, (-1)^r\, \frac{r!}{n(n-1)\cdots(n-r)}$

for some $K > 0$.
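
The logarithmic case rests on the exact coefficient identity $[z^n]\,(1-z)^r \ln\frac{1}{1-z} = (-1)^r \frac{r!}{n(n-1)\cdots(n-r)}$ for $n > r$. The sketch below verifies this identity with exact rational arithmetic; it is an illustration only, and the function names are hypothetical. It uses $\ln\frac{1}{1-z} = \sum_{m\ge 1} \frac{z^m}{m}$ and the binomial expansion of $(1-z)^r$.

```python
from fractions import Fraction
from math import comb, factorial


def log_power_coefficient(r, n):
    """Exact [z^n] of (1 - z)^r * ln(1 / (1 - z)), valid for n > r >= 0."""
    # Convolution of the coefficients (-1)^i * binom(r, i) with 1 / (n - i).
    return sum(Fraction((-1) ** i * comb(r, i), n - i) for i in range(r + 1))


def closed_form(r, n):
    """(-1)^r * r! / (n (n - 1) ... (n - r))."""
    denom = 1
    for i in range(r + 1):
        denom *= n - i
    return Fraction((-1) ** r * factorial(r), denom)


if __name__ == "__main__":
    for n in range(5, 10):
        print(n, log_power_coefficient(4, n), closed_form(4, n))
```

For $r = 4$ the coefficient is $\frac{4!}{n(n-1)(n-2)(n-3)(n-4)}$, which is the subexponential factor appearing for 3-noncrossing RNA structures.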



Let us first analyze the case $k = 2$, which illustrates the general strategy without the technicality of establishing the existence of a suitable singular expansion. Here the generating function itself can be used directly (i.e. is its own singular expansion). Our particular proof, given in Section 5, exercises the basic strategy used in the proof of Theorem 5. In particular, Theorem 4 improves on the quality of approximation, providing a subexponential factor of higher order compared to [14].

Theorem 4. The number of RNA secondary, i.e. 2-noncrossing RNA structures is asymptotically given by

(4.6)   $S_2(n) \sim \frac{1.1002}{\sqrt{n}} \left( \frac{1}{n+1} - \frac{1}{8n(n+1)} + \frac{1}{128 n^3} + \frac{5}{1024 n^4} + O(n^{-5}) \right) \left( \frac{3+\sqrt{5}}{2} \right)^n$.

We next analyze the 3-noncrossing RNA structures. Here the situation changes dramatically since it has to be shown that a suitable singular expansion exists. We will prove this using the determinant formula arising in the context of the exponential generating function of $f_k(2n, 0)$ given in eq. (2.3).

Theorem 5. The number of 3-noncrossing RNA structures is asymptotically given by

$S_3(n) \sim \frac{10.4724 \cdot 4!}{n(n-1)\cdots(n-4)} \left( \frac{5+\sqrt{21}}{2} \right)^n$.

Proof. Claim 1. The dominant singularity $\rho_3$ of the power series $\sum_{n\ge 0} S_3(n) z^n$ is unique.
In order to prove Claim 1 we use Lemma 2, according to which the analytic function $\Xi_3(z) = \frac{1}{z^2-z+1} \sum_{n\ge 0} f_3(2n,0) \left(\frac{z}{z^2-z+1}\right)^{2n}$ is the analytic continuation of the power series $\sum_{n\ge 0} S_3(n) z^n$. We proceed by showing that $\Xi_3(z)$ has exactly 6 singularities in $\mathbb{C}$, among which $\rho_3$ is the unique one of minimal modulus. The first two singularities are the roots of the quadratic polynomial $P(z) = (z - \frac{1}{2})^2 + \frac{3}{4}$, given by $\alpha_1 = \frac{1}{2} + i\frac{\sqrt{3}}{2}$ and $\alpha_2 = \frac{1}{2} - i\frac{\sqrt{3}}{2}$. Next we observe that the power series $\sum_{n\ge 0} f_3(2n, 0) y^n$ has the analytic continuation $\Psi(y)$ (obtained by MAPLE sumtools) given by
(4.7) m = -i — U) T 16 ^ , 

Id j/2 

where $P_\nu^m(x)$ denotes the Legendre polynomial of the first kind with the parameters $\nu = \frac{3}{2}$ and $m = -1$. $\Psi(y)$ has one dominant singularity at $y = \frac{1}{16}$, which in view of $\vartheta(z) = \left(\frac{z}{z^2-z+1}\right)^2$ induces exactly 4 singularities of $\Xi_3(z) = \frac{1}{z^2-z+1}\, \Psi\Big(\big(\frac{z}{z^2-z+1}\big)^2\Big)$. Indeed, $\Psi(y^2)$ has the two singularities in $\mathbb{C}$: $\beta_1 = \frac{1}{4}$ and $\beta_2 = -\frac{1}{4}$, which produce for $\Xi_3(z)$ the four singularities $\rho_3 = \frac{5-\sqrt{21}}{2}$, $\zeta_2 = \frac{5+\sqrt{21}}{2}$, $\zeta_3 = \frac{-3-\sqrt{5}}{2}$ and $\zeta_4 = \frac{-3+\sqrt{5}}{2}$. Since $|\alpha_1| = |\alpha_2| = 1$, $\rho_3$ is the unique singularity of $\Xi_3(z)$ of minimal modulus and Claim 1 follows.

Claim 2: the singular expansion. $\Psi(z)$ is $\Delta_{\frac{1}{16}}(\phi, R)$-analytic and has the singular expansion $(1 - 16z)^4 \ln\big(\frac{1}{1-16z}\big)$:

(4.8)   $\forall\, z \in \Delta_{\frac{1}{16}}(\phi, R); \quad \Psi(z) = O\Big( (1-16z)^4 \ln\big(\tfrac{1}{1-16z}\big) \Big)$.

First, $\Delta_{\frac{1}{16}}(\phi, R)$-analyticity of the function $(1 - 16z)^4 \ln\big(\frac{1}{1-16z}\big)$ is obvious. We proceed by proving that $(1 - 16z)^4 \ln\big(\frac{1}{1-16z}\big)$ is the singular expansion. Using the notation of falling factorials $(n-1)_4 = (n-1)(n-2)(n-3)(n-4)$ we observe

$f_3(2n,0) = C_{n+2} C_n - C_{n+1}^2 = \frac{1}{(n-1)_4}\, \frac{12\,(n-1)_4\,(2n+1)}{(n+3)(n+1)^2(n+2)^2} \binom{2n}{n}^2$.

With this expression for $f_3(2n,0)$ we arrive at the formal identity

$\sum_{n\ge 5} 16^{-n} f_3(2n,0)\, z^n = \sum_{n\ge 5} \left[ 16^{-n}\, \frac{1}{(n-1)_4}\, \frac{12\,(n-1)_4\,(2n+1)}{(n+3)(n+1)^2(n+2)^2} \binom{2n}{n}^2 - \frac{4!}{(n-1)_4}\, \frac{1}{\pi}\, \frac{1}{n} \right] z^n + \sum_{n\ge 5} \frac{4!}{(n-1)_4}\, \frac{1}{\pi}\, \frac{1}{n}\, z^n$.

It is clear that for $|z| \le 1$

$\left| \sum_{n\ge 5} \left[ 16^{-n}\, \frac{1}{(n-1)_4}\, \frac{12\,(n-1)_4\,(2n+1)}{(n+3)(n+1)^2(n+2)^2} \binom{2n}{n}^2 - \frac{4!}{(n-1)_4}\, \frac{1}{\pi}\, \frac{1}{n} \right] z^n \right| \le \sum_{n\ge 5} \left| 16^{-n}\, \frac{1}{(n-1)_4}\, \frac{12\,(n-1)_4\,(2n+1)}{(n+3)(n+1)^2(n+2)^2} \binom{2n}{n}^2 - \frac{4!}{(n-1)_4}\, \frac{1}{\pi}\, \frac{1}{n} \right| < \kappa$

for some $\kappa < 0.0784$. Therefore we can conclude

(4.9)   $\sum_{n\ge 5} 16^{-n} f_3(2n,0)\, z^n = O\Big( \sum_{n\ge 5} \frac{4!}{(n-1)_4}\, \frac{1}{\pi}\, \frac{1}{n}\, z^n \Big)$,

where $f(z) = O(g(z))$ denotes that $f(z)/g(z)$ is bounded for $z \to 1$, eq. (4.4). We proceed by interpreting the power series on the rhs, observing

(4.10)   $\forall\, n \ge 5; \quad [z^n] \Big( (1-z)^4 \ln\frac{1}{1-z} \Big) = \frac{4!}{(n-1)\cdots(n-4)\, n}$,

whence $\frac{1}{\pi}(1-z)^4 \ln\big(\frac{1}{1-z}\big)$ is the unique analytic continuation of $\sum_{n\ge 5} \frac{4!}{(n-1)_4}\, \frac{1}{\pi}\, \frac{1}{n}\, z^n$. Using the scaling property of Taylor coefficients $[z^n] f(z) = \gamma^n [z^n] f(\frac{z}{\gamma})$ we obtain

(4.11)   $\forall\, z \in \Delta_{\frac{1}{16}}(\phi, R); \quad \Psi(z) = O\Big( (1-16z)^4 \ln\big(\tfrac{1}{1-16z}\big) \Big)$.

Therefore we have proved that $(1-16z)^4 \ln\big(\frac{1}{1-16z}\big)$ is the singular expansion of $\Psi(z)$ at $z = \frac{1}{16}$, whence Claim 2. Our last step consists in verifying that the type of the singularity does not change when passing from $\Psi(z)$ to $\Xi_3(z) = \frac{1}{z^2-z+1}\, \Psi\Big(\big(\frac{z}{z^2-z+1}\big)^2\Big)$. That is, we show that the singular expansion is not affected by substituting $\vartheta(z) = \big(\frac{z}{z^2-z+1}\big)^2$.

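Claim 3: the singularity persists. For $z \in \Delta_{\rho_3}(\phi, R)$ we have $\Xi_3(z) = O\Big( \big(1 - \frac{z}{\rho_3}\big)^4 \ln\big(\frac{1}{1 - \frac{z}{\rho_3}}\big) \Big)$.
To prove the claim we first observe that Claim 2 and Lemma 2 imply

$\Xi_3(z) = O\left( \frac{1}{z^2-z+1} \Big[ 1 - 16\big(\tfrac{z}{z^2-z+1}\big)^2 \Big]^4 \ln\frac{1}{1 - 16\big(\frac{z}{z^2-z+1}\big)^2} \right)$.

The Taylor expansion of $q(z) = 1 - 16\big(\frac{z}{z^2-z+1}\big)^2$ at $\rho_3$ is given by $q(z) = \alpha(\rho_3 - z) + O\big((z - \rho_3)^2\big)$ with $\alpha = \frac{1-\rho_3^2}{2\rho_3^2} = \frac{21+5\sqrt{21}}{4}$, and we compute

$\frac{1}{z^2-z+1}\, q(z)^4 \ln\frac{1}{q(z)} = \frac{\big(\alpha(\rho_3 - z) + O((z-\rho_3)^2)\big)^4 \ln\frac{1}{\alpha(\rho_3 - z) + O((z-\rho_3)^2)}}{(z-\rho_3)^2 + (2\rho_3 - 1)(z - \rho_3) + \rho_3^2 - \rho_3 + 1}$

$= \frac{\big([\alpha + O(z-\rho_3)](\rho_3 - z)\big)^4 \ln\frac{1}{[\alpha + O(z-\rho_3)](\rho_3 - z)}}{O(z - \rho_3) + \rho_3^2 - \rho_3 + 1}$

$= O\Big( (\rho_3 - z)^4 \ln\frac{1}{\rho_3 - z} \Big)$,

whence Claim 3. Now we are in the position to employ Theorem 3 and obtain for $S_3(n)$

$S_3(n) \sim K'\, [z^n] \Big( (\rho_3 - z)^4 \ln\frac{1}{\rho_3 - z} \Big) \sim K'\, \frac{4!}{n(n-1)\cdots(n-4)} \left(\frac{1}{\rho_3}\right)^n$.

Of course $K'$ can be computed from Theorem 1 explicitly, $K' = 10.4724$, and the proof of the Theorem is complete. □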
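
The quality of the asymptotic formula can be examined numerically by comparing it with exact values of $S_3(n)$. The sketch below is an illustration, not the computation used in the paper: it assumes the alternating-sum formula of Theorem 1 with $f_3(2m,0) = C_{m+2} C_m - C_{m+1}^2$ and the constant $K' = 10.4724$ stated above, and the function names are our own.

```python
from math import comb, factorial, sqrt


def catalan(m):
    """m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


def f3(m):
    """f_3(2m, 0) = C_{m+2} C_m - C_{m+1}^2."""
    return catalan(m + 2) * catalan(m) - catalan(m + 1) ** 2


def s3(n):
    """Exact S_3(n) from the inclusion-exclusion sum of Theorem 1."""
    def total(m):
        # sum over isolated-vertex counts of f_3(m, l)
        return sum(comb(m, 2 * j) * f3(j) for j in range(m // 2 + 1))
    return sum((-1) ** b * comb(n - b, b) * total(n - 2 * b)
               for b in range(n // 2 + 1))


def theorem5_estimate(n, constant=10.4724):
    """constant * 4! / (n (n-1) ... (n-4)) * ((5 + sqrt(21)) / 2)^n, for n >= 5."""
    falling = 1
    for i in range(5):
        falling *= n - i
    return constant * factorial(4) / falling * ((5 + sqrt(21)) / 2) ** n


if __name__ == "__main__":
    for n in (25, 50, 100, 200):
        print(n, s3(n) / theorem5_estimate(n))
```

The printed ratios indicate how fast the exact counts approach the asymptotic expression; no particular numerical values are asserted here.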



5. Proofs 



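Proof of Lemma 1. First we observe that for $z \in\, ]-1, 1[$ and $w \in [-1, 1]$ the term $w^2 z^2 - z + 1$ is strictly positive. We set

(5.1)   $F_k(z, w) = \sum_{n\ge 0} \sum_{h\le n/2} S'_k(n,h)\, w^{2h} z^n$

and compute

$F_k(z,w) = \sum_{n\ge 0} \sum_{h\le n/2} \sum_{j=0}^{h} (-1)^j \binom{n-j}{j} \binom{n-2j}{2(h-j)} f_k(2(h-j), 0)\, w^{2h} z^n$

$= \sum_{n\ge 0} \sum_{j\le n/2} \sum_{h=j}^{n/2} (-1)^j \binom{n-j}{j} \binom{n-2j}{2(h-j)} f_k(2(h-j), 0)\, w^{2h} z^n$

$= \sum_{j\ge 0} \sum_{n\ge 2j} \sum_{h=j}^{n/2} (-1)^j \binom{n-j}{j} \binom{n-2j}{2(h-j)} f_k(2(h-j), 0)\, w^{2h} z^n$

$= \sum_{j\ge 0} \frac{(-1)^j}{j!} \sum_{n\ge 2j} \frac{(n-j)!}{(n-2j)!} \Big\{ \sum_{h=j}^{n/2} \binom{n-2j}{2(h-j)} f_k(2(h-j), 0)\, w^{2h} \Big\} z^n$.

We shift summation indices $n' = n - 2j$ and $h' = h - j$ and derive for the rhs the following expression

$\sum_{j\ge 0} \frac{(-1)^j}{j!} (w^2 z^2)^j \sum_{n'\ge 0} \frac{(n'+j)!}{n'!} \Big\{ \sum_{h'=0}^{n'/2} \binom{n'}{2h'} f_k(2h', 0)\, w^{2h'} \Big\} z^{n'}$.

The idea is now to interpret the term $\sum_{h'=0}^{n'/2} \binom{n'}{2h'} f_k(2h', 0)\, w^{2h'}$ as a product of the two power series $e^x$ and $\sum_{n\ge 0} f_k(2n, 0) \frac{(wx)^{2n}}{(2n)!}$:

$\sum_{\ell\ge 0} \frac{x^\ell}{\ell!}\, \sum_{n\ge 0} f_k(2n,0) \frac{(wx)^{2n}}{(2n)!} = \sum_{n'\ge 0} \sum_{2n+\ell=n'} \frac{n'!}{\ell!\,(2n)!}\, f_k(2n,0)\, w^{2n}\, \frac{x^{n'}}{n'!} = \sum_{n'\ge 0} \Big\{ \sum_{n=0}^{n'/2} \binom{n'}{2n} f_k(2n,0)\, w^{2n} \Big\} \frac{x^{n'}}{n'!}$.

We set $\eta_{n'} = \big\{ \sum_{h'=0}^{n'/2} \binom{n'}{2h'} f_k(2h', 0)\, w^{2h'} \big\}$. By assumption we have $|z| < \rho_k(w)$ and we next derive, using the Laplace transformation and interchanging integration and summation

(5.2)   $\sum_{n'\ge 0} \frac{(n'+j)!}{n'!}\, \eta_{n'}\, z^{n'} = \int_0^\infty \sum_{n'\ge 0} \eta_{n'}\, \frac{(zt)^{n'}}{n'!}\, t^j e^{-t}\, dt$.

Since $|z| < \rho_k(w)$ the above transformation is valid and using

(5.3)   $\sum_{n'\ge 0} \Big\{ \sum_{n=0}^{n'/2} \binom{n'}{2n} f_k(2n,0)\, w^{2n} \Big\} \frac{x^{n'}}{n'!} = \sum_{\ell\ge 0} \frac{x^\ell}{\ell!}\, \sum_{n\ge 0} f_k(2n,0) \frac{(wx)^{2n}}{(2n)!}$

we accordingly obtain

(5.4)   $\int_0^\infty \sum_{n'\ge 0} \eta_{n'}\, \frac{(zt)^{n'}}{n'!}\, t^j e^{-t}\, dt = \int_0^\infty e^{zt} \sum_{n\ge 0} f_k(2n,0) \frac{(wzt)^{2n}}{(2n)!}\, t^j e^{-t}\, dt$.

The next step is to substitute the term $\sum_{n'\ge 0} \frac{(n'+j)!}{n'!}\, \eta_{n'}\, z^{n'}$ in eq. (5.2), whence consequently

$F_k(z,w) = \sum_{j\ge 0} \frac{(-1)^j}{j!} (w^2 z^2)^j \int_0^\infty e^{zt} \sum_{n\ge 0} f_k(2n,0) \frac{(wzt)^{2n}}{(2n)!}\, t^j e^{-t}\, dt$

$= \int_0^\infty \sum_{j\ge 0} \frac{(-w^2 z^2 t)^j}{j!}\, e^{-(1-z)t} \sum_{n\ge 0} f_k(2n,0) \frac{(wzt)^{2n}}{(2n)!}\, dt$.

The summation over the index $j$ is just an exponential function and we derive

$F_k(z,w) = \int_0^\infty e^{-(w^2 z^2 - z + 1)t} \sum_{n\ge 0} f_k(2n,0) \frac{(wzt)^{2n}}{(2n)!}\, dt$

$= \int_0^\infty e^{-(w^2 z^2 - z + 1)t} \sum_{n\ge 0} f_k(2n,0)\, \frac{1}{(2n)!} \left( \frac{wz}{w^2 z^2 - z + 1} \right)^{2n} \big( (w^2 z^2 - z + 1)\, t \big)^{2n}\, dt$.

We proceed by transforming the integral introducing $u = (w^2 z^2 - z + 1)\,t$, i.e. $dt = (w^2 z^2 - z + 1)^{-1} du$, and accordingly arrive at

$F_k(z,w) = \sum_{n\ge 0} f_k(2n,0)\, \frac{1}{(2n)!} \left( \frac{wz}{w^2 z^2 - z + 1} \right)^{2n} \frac{1}{w^2 z^2 - z + 1} \int_0^\infty e^{-u} u^{2n}\, du$

$= \frac{1}{w^2 z^2 - z + 1} \sum_{n\ge 0} f_k(2n,0) \left( \frac{wz}{w^2 z^2 - z + 1} \right)^{2n}$,

where we used $\int_0^\infty e^{-u} u^{2n}\, du = (2n)!$, whence the lemma. □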



Proof of Theorem 4. We shall begin by deriving the asymptotics of $f_2(2n, 0) = C_n$. Since $\sum_{n\ge 0} \binom{2n}{n} z^n = (1 - 4z)^{-\frac{1}{2}}$, we observe

$C_n = \frac{1}{n+1}\, [z^n]\, (1-4z)^{-\frac{1}{2}}$

and according to Theorem 3 we can express $C_n$ asymptotically as

(5.5)   $C_n \sim \frac{4^n}{\sqrt{\pi n}} \left( \frac{1}{n+1} - \frac{1}{8n(n+1)} + \frac{1}{128 n^3} + \frac{5}{1024 n^4} + O(n^{-5}) \right)$.

The generating function of the Catalan numbers is given by

(5.6)   $\Psi(y) = \sum_{n\ge 0} C_n y^n = \frac{1 - \sqrt{1-4y}}{2y}$,

having a branch-point singularity at $\frac{1}{4}$. Lemma 2 allows us to express the analytic continuation of $\sum_{n\ge 0} S_2(n) z^n$ via $\Psi$:

(5.7)   $\Xi_2(z) = \frac{1}{z^2 - z + 1}\, \Psi\Big( \big( \tfrac{z}{z^2 - z + 1} \big)^2 \Big)$

(5.8)   $\phantom{\Xi_2(z)} = \frac{1}{z^2 - z + 1} \cdot \frac{1 - \sqrt{1 - 4\big(\frac{z}{z^2-z+1}\big)^2}}{2\big(\frac{z}{z^2-z+1}\big)^2}$.

The explicit form of $\Xi_2(z)$ allows us to conclude that $\rho_2 = \frac{3-\sqrt{5}}{2}$ is the unique dominant singularity. We denote the map $z \mapsto \big(\frac{z}{z^2-z+1}\big)^2$ by $\vartheta$ and compute the first terms of the Taylor series at $z = \rho_2$, i.e. where $\vartheta(\rho_2) = \frac{1}{4}$:

(5.9)   $\vartheta(z) = \frac{1}{4} + \frac{5+3\sqrt{5}}{8}\,(z - \rho_2) + (z - \rho_2)^2\, T(z)$,

where $T(z) = \sum_{i\ge 0} c_i (z - \rho_2)^i$, $c_i \in \mathbb{R}$. Analyzing $\Xi_2(z)$ in an intersection of an $\epsilon$-disc around $\rho_2$ with $\Delta_{\rho_2}$ produces



(5-10) S 2 (z) = * —g- 

from which we immediately conclude

(5.11)   $\Xi_2(\rho_2 z) = O\big( \Psi(\tfrac{z}{4}) \big)$.

Theorem 3 and the scaling property of Taylor coefficients $[z^n] f(z) = \gamma^n [z^n] f(\frac{z}{\gamma})$ imply

(5.12)   $K\, [z^n]\, \Xi_2(\rho_2 z) \sim [z^n]\, \Psi(\tfrac{z}{4})$, for some $K > 0$,

and we accordingly arrive, substituting $\alpha = -\frac{1}{2}$, at

$[z^n]\, \Xi_2(z) \sim \frac{K}{\sqrt{\pi n}} \left( \frac{1}{n+1} - \frac{1}{8n(n+1)} + \frac{1}{128 n^3} + \frac{5}{1024 n^4} + O(n^{-5}) \right) \left( \frac{3+\sqrt{5}}{2} \right)^n$

for some $K > 0$. Via Theorem 1 the coefficients $S_2(n)$ are explicitly known and we compute $K = 1.9572$, from which the theorem follows. □